scientific research
Scientific Document Retrieval using Multi-level Aspect-based Queries
In scientific research, the ability to effectively retrieve relevant documents based on complex, multifaceted queries is critical. Existing evaluation datasets for this task are limited, primarily due to the high costs and effort required to annotate resources that effectively represent complex queries. To address this, we propose a novel task, $\textbf{S}$cientific $\textbf{Do}$cument $\textbf{R}$etrieval using $\textbf{M}$ulti-level $\textbf{A}$spect-based qu$\textbf{E}$ries (DORIS-MAE), which is designed to handle the complex nature of user queries in scientific research. We developed a benchmark dataset within the field of computer science, consisting of 100 human-authored complex query cases. For each complex query, we assembled a collection of 100 relevant documents and produced annotated relevance scores for ranking them.
ReplicationBench: Can AI Agents Replicate Astrophysics Research Papers?
Ye, Christine, Yuan, Sihan, Cooray, Suchetha, Dillmann, Steven, Roque, Ian L. V., Baron, Dalya, Frank, Philipp, Martin-Alvarez, Sergio, Koblischke, Nolan, Qu, Frank J, Yang, Diyi, Wechsler, Risa, Ciuca, Ioana
Frontier AI agents show increasing promise as scientific research assistants, and may eventually be useful for extended, open-ended research workflows. However, in order to use agents for novel research, we must first assess the underlying faithfulness and correctness of their work. To evaluate agents as research assistants, we introduce ReplicationBench, an evaluation framework that tests whether agents can replicate entire research papers drawn from the astrophysics literature. Astrophysics, where research relies heavily on archival data and computational study while requiring little real-world experimentation, is a particularly useful testbed for AI agents in scientific research. We split each paper into tasks which require agents to replicate the paper's core contributions, including the experimental setup, derivations, data analysis, and codebase. Each task is co-developed with the original paper authors and targets a key scientific result, enabling objective evaluation of both faithfulness (adherence to original methods) and correctness (technical accuracy of results). ReplicationBench is extremely challenging for current frontier language models: even the best-performing language models score under 20%. We analyze ReplicationBench trajectories in collaboration with domain experts and find a rich, diverse set of failure modes for agents in scientific research. ReplicationBench establishes the first benchmark of paper-scale, expert-validated astrophysics research tasks, reveals insights about agent performance generalizable to other domains of data-driven science, and provides a scalable framework for measuring AI agents' reliability in scientific research.
- North America > Canada > Ontario > Toronto (0.14)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (4 more...)
3 common alcohol myths, debunked
Humans have a long history with alcohol--we've been making and consuming it for over ten thousand years, about as long as we've had agriculture. That's a long time for people to come up with all kinds of ideas about the drug and how it works. So, not surprisingly, some of them are wrong. Here are a few common myths about alcohol, debunked by scientific research.
- Europe > United Kingdom (0.05)
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.05)
- Asia > Middle East > Jordan (0.05)
- Asia > Japan (0.05)
PLLuM: A Family of Polish Large Language Models
Kocoń, Jan, Piasecki, Maciej, Janz, Arkadiusz, Ferdinan, Teddy, Radliński, Łukasz, Koptyra, Bartłomiej, Oleksy, Marcin, Woźniak, Stanisław, Walkowiak, Paweł, Wojtasik, Konrad, Moska, Julia, Naskręt, Tomasz, Walkowiak, Bartosz, Gniewkowski, Mateusz, Szyc, Kamil, Motyka, Dawid, Banach, Dawid, Dalasiński, Jonatan, Rudnicka, Ewa, Alberski, Bartłomiej, Walkowiak, Tomasz, Szczęsny, Aleksander, Markiewicz, Maciej, Bernaś, Tomasz, Mazur, Hubert, Żyta, Kamil, Tykierko, Mateusz, Chodak, Grzegorz, Kajdanowicz, Tomasz, Kazienko, Przemysław, Karlińska, Agnieszka, Seweryn, Karolina, Kołos, Anna, Chrabąszcz, Maciej, Lorenc, Katarzyna, Krasnodębska, Aleksandra, Wilczek, Artur, Dziewulska, Katarzyna, Betscher, Paula, Cieślińska, Zofia, Kowol, Katarzyna, Mikoś, Daria, Trzciński, Maciej, Krutul, Dawid, Kozłowski, Marek, Dadas, Sławomir, Poświata, Rafał, Perełkiewicz, Michał, Grębowiec, Małgorzata, Kazuła, Maciej, Białas, Marcin, Roszko, Roman, Roszko, Danuta, Vaičenonienė, Jurgita, Utka, Andrius, Levchuk, Paweł, Kowalski, Paweł, Prawdzic-Jankowska, Irena, Ogrodniczuk, Maciej, Borys, Monika, Bulińska, Anna, Gumienna, Wiktoria, Kieraś, Witold, Komosińska, Dorota, Krasnowska-Kieraś, Katarzyna, Kobyliński, Łukasz, Lewandowska, Martyna, Łaziński, Marek, Łątkowski, Mikołaj, Mastalerz, Dawid, Milewicz, Beata, Mykowiecka, Agnieszka Anna, Peljak-Łapińska, Angelika, Penno, Sandra, Przybysz, Zuzanna, Rudolf, Michał, Rybak, Piotr, Saputa, Karolina, Tomaszewska, Aleksandra, Wawer, Aleksander, Woliński, Marcin, Wołoszyn, Joanna, Wróblewska, Alina, Żuk, Bartosz, Żarnecki, Filip, Kaczyński, Konrad, Cichosz, Anna, Deckert, Zuzanna, Garnys, Monika, Grabarczyk, Izabela, Janowski, Wojciech, Karasińska, Sylwia, Kujawiak, Aleksandra, Misztela, Piotr, Szymańska, Maria, Walkusz, Karolina, Siek, Igor, Kwiatkowski, Jakub, Pęzik, Piotr
Large Language Models (LLMs) play a central role in modern artificial intelligence, yet their development has been primarily focused on English, resulting in limited support for other languages. We present PLLuM (Polish Large Language Model), the largest open-source family of foundation models tailored specifically for the Polish language. Developed by a consortium of major Polish research institutions, PLLuM addresses the need for high-quality, transparent, and culturally relevant language models beyond the English-centric commercial landscape. We describe the development process, including the construction of a new 140-billion-token Polish text corpus for pre-training, a 77k custom instructions dataset, and a 100k preference optimization dataset. A key component is a Responsible AI framework that incorporates strict data governance and a hybrid module for output correction and safety filtering. We detail the models' architecture, training procedures, and alignment techniques for both base and instruction-tuned variants, and demonstrate their utility in a downstream task within public administration. By releasing these models publicly, PLLuM aims to foster open research and strengthen sovereign AI technologies in Poland.
- Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
- Europe > Poland > Łódź Province > Łódź (0.04)
- Asia > Middle East > Jordan (0.04)
- (21 more...)
- Overview (1.00)
- Research Report > New Finding (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
25 years of research in space
MIT astronauts aboard the International Space Station--and the MIT researchers who have sent up experiments--have advanced our understanding of science, space, and the universe. This image of the International Space Station and space shuttle Endeavour, flying at an altitude of approximately 350 kilometers, was taken by Expedition 27 crew member Paolo Nespoli from the Soyuz TMA-20 on May 24, 2011. On November 2, 2000, NASA astronaut Bill Shepherd, OCE '78, SM '78, and Russian cosmonauts Sergei Krikalev and Yuri Gidzenko made history as their Soyuz spacecraft docked with the International Space Station. The event marked the start of 25 years of continuous human presence in space aboard the ISS--a prolific period for space research. MIT-trained astronauts, scientists, and engineers have played integral roles in all aspects of the station's design, assembly, operations, and scientific research. One of MIT's most experienced NASA astronauts, Mike Fincke '89, is celebrating that milestone from space.
- Asia > Russia (0.28)
- North America > United States > Massachusetts (0.04)
- North America > Canada (0.04)
- (3 more...)
- Government > Space Agency (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
PETLP: A Privacy-by-Design Pipeline for Social Media Data in AI Research
Oh, Nick, Vrakas, Giorgos D., Brooke, Siân J. M., Morinière, Sasha, Duke, Toju
We introduce PETLP (Privacy-by-design Extract, Transform, Load, and Present), a compliance framework that embeds legal safeguards directly into extended ETL pipelines. Central to PETLP is treating Data Protection Impact Assessments as living documents that evolve from preregistration through dissemination. Through systematic Reddit analysis, we demonstrate how extraction rights fundamentally differ between qualifying research organisations (who can invoke DSM Article 3 to override platform restrictions) and commercial entities (bound by terms of service), whilst GDPR obligations apply universally. We demonstrate why true anonymisation remains unachievable for social media data and expose the legal gap between permitted dataset creation and uncertain model distribution. By structuring compliance decisions into practical workflows and simplifying institutional data management plans, PETLP enables researchers to navigate regulatory complexity with confidence, bridging the gap between legal requirements and research practice.
- Europe > Ireland (0.04)
- Europe > Middle East > Cyprus (0.04)
- Europe > Germany (0.04)
- (4 more...)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Research Report > New Finding (0.67)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > Europe Government (0.47)
Evolving and Executing Research Plans via Double-Loop Multi-Agent Collaboration
Zhang, Zhi, Liu, Yan, Hu, Zhejing, Chen, Gong, Zhong, Sheng-hua, Cao, Jiannong
Automating the end-to-end scientific research process poses a fundamental challenge: it requires both evolving high-level plans that are novel and sound, and executing these plans correctly amidst dynamic and uncertain conditions. To address this bilevel challenge, we propose a novel Double-Loop Multi-Agent (DLMA) framework to solve the given research problem automatically. The leader loop, composed of professor agents, is responsible for evolving research plans. It employs an evolutionary algorithm through involvement, improvement, and integration meetings to iteratively generate and refine a pool of research proposals, exploring the solution space effectively. The follower loop, composed of doctoral student agents, is responsible for executing the best-evolved plan. It dynamically adjusts the plan during implementation via pre-hoc and post-hoc meetings, ensuring each step (e.g., drafting, coding) is well-supported by contextual and external observations. Extensive experiments on benchmarks like ACLAward and Laboratory show that DLMA generates research papers that achieve state-of-the-art scores in automated evaluation, significantly outperforming strong baselines. Ablation studies confirm the critical roles of both loops, with evolution driving novelty and execution ensuring soundness.
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Hong Kong (0.04)
- North America > United States > Illinois (0.04)
- Overview (0.93)
- Research Report > Promising Solution (0.46)
Google-owner reveals £5bn AI investment in UK ahead of Trump visit
The world's fourth biggest company, Google-owner Alphabet, has announced a new £5bn ($6.8bn) investment in UK artificial intelligence (AI). The money will be used for infrastructure and scientific research over the next two years - the first of several massive US investments being unveiled ahead of US President Donald Trump's state visit. Google's President and Chief Investment Officer Ruth Porat told BBC News in an exclusive interview that there were profound opportunities in the UK for its pioneering work in advanced science. The company will officially open a vast $1bn (£735m) data centre in Waltham Cross, Hertfordshire, with Chancellor Rachel Reeves on Tuesday. The investment will expand this site and also include funding for London-based DeepMind, run by British Nobel Prize winner Sir Demis Hassabis, which deploys AI to revolutionise advanced scientific research.
- Europe > United Kingdom > England > Hertfordshire (0.25)
- North America > Central America (0.15)
- Oceania > Australia (0.06)
- (14 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Energy (1.00)
- Banking & Finance (1.00)
- Information Technology (0.94)
How the internet and its bots are sabotaging scientific research
There was a time, just a couple of decades ago, when researchers in psychology and health always had to engage with people face-to-face or using the telephone. The worst case scenario was sending questionnaire packs out to postal addresses and waiting for handwritten replies. So we either literally met our participants, or we had multiple corroborating points of evidence that indicated we were dealing with a real person who was, therefore, likely to be telling us the truth about themselves. Since then, technology has done what it always does – creating opportunities for us to cut costs, save time and access wider pools of participants on the internet. But what most people have failed to fully realise is that internet research has brought along risks of data corruption or impersonation which could be deliberately aiming to put research projects in jeopardy.
- Information Technology > Communications > Networks (0.63)
- Information Technology > Artificial Intelligence (0.50)
Neanderthals bred with humans 100,000 YEARS earlier than first thought, scientists say - as they discover skeleton of five-year-old crossbreed
Neanderthals bred with our human ancestors 100,000 years earlier than previously thought, according to a new study. Experts have discovered that a five-year-old child who lived 140,000 years ago had parents from both species. Their fossil – likely a female – was first unearthed 90 years ago in the Skhul Cave on Mount Carmel in what is now northern Israel. A team from Tel Aviv University and the French Centre for Scientific Research conducted a series of advanced tests on the remaining bones, including a CT scan of the skull. 'Genetic studies over the past decade have shown that these two groups exchanged genes,' said lead author Professor Israel Hershkovitz.
- North America > United States > Indiana > Hamilton County > Carmel (0.25)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.25)
- Africa (0.06)
- (3 more...)